Probabilistic modeling
Classification-by-Components: Probabilistic Modeling of Reasoning over a Set of Components
Neural networks are state-of-the-art classification approaches but are generally difficult to interpret. This issue can be partly alleviated by constructing a precise decision process within the neural network. In this work, a network architecture, denoted as Classification-By-Components network (CBC), is proposed. It is restricted to follow an intuitive reasoning-based decision process inspired by Biederman's recognition-by-components theory from cognitive psychology. The network is trained to learn and detect generic components that characterize objects.
Expressive power of tensor-network factorizations for probabilistic modeling
Tensor-network techniques have recently proven useful in machine learning, both as a tool for the formulation of new learning algorithms and for enhancing the mathematical understanding of existing methods. Inspired by these developments, and the natural correspondence between tensor networks and probabilistic graphical models, we provide a rigorous analysis of the expressive power of various tensor-network factorizations of discrete multivariate probability distributions. These factorizations include non-negative tensor-trains/MPS, which are in correspondence with hidden Markov models, and Born machines, which are naturally related to the probabilistic interpretation of quantum circuits. When used to model probability distributions, they exhibit tractable likelihoods and admit efficient learning algorithms. Interestingly, we prove that there exist probability distributions for which there are unbounded separations between the resource requirements of some of these tensor-network factorizations. Of particular interest, using complex instead of real tensors can lead to an arbitrarily large reduction in the number of parameters of the network. Additionally, we introduce locally purified states (LPS), a new factorization inspired by techniques for the simulation of quantum systems, with provably better expressive power than all other representations considered. The ramifications of this result are explored through numerical experiments.
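To make the factorizations discussed above concrete, the following is a minimal sketch of a non-negative tensor-train (MPS) model of a discrete multivariate distribution. The chain length, local dimension, and bond rank are illustrative choices, not values from the paper; the point is that both evaluation and normalization reduce to cheap chain contractions, which is what makes the likelihood tractable.

```python
import numpy as np

rng = np.random.default_rng(0)

n_sites, dim, rank = 4, 2, 3  # chain length, local dimension, bond rank (illustrative)

# Non-negative tensor-train cores A[i] with shape (rank, dim, rank);
# boundary vectors contract the open bonds at the chain ends.
cores = [rng.random((rank, dim, rank)) for _ in range(n_sites)]
left = rng.random(rank)
right = rng.random(rank)

def unnormalized_p(x):
    """Contract the chain for one configuration x = (x_1, ..., x_n)."""
    v = left
    for core, xi in zip(cores, x):
        v = v @ core[:, xi, :]
    return float(v @ right)

# Normalization constant: sum the contraction over all configurations by
# contracting each core's physical leg with the all-ones vector.
v = left
for core in cores:
    v = v @ core.sum(axis=1)
Z = float(v @ right)

def p(x):
    return unnormalized_p(x) / Z

# The probabilities over all dim**n_sites configurations sum to 1.
total = sum(p(x) for x in np.ndindex(*(dim,) * n_sites))
```

Because every core entry is non-negative, every contraction is non-negative, so positivity is automatic; a Born machine would instead square the modulus of a (possibly complex) contraction, which is where the real-versus-complex separations studied in the paper arise.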
A Rotated Hyperbolic Wrapped Normal Distribution for Hierarchical Representation Learning
We present a rotated hyperbolic wrapped normal distribution (RoWN), a simple yet effective alteration of the hyperbolic wrapped normal distribution (HWN). The HWN expands the domain of probabilistic modeling from Euclidean to hyperbolic space, where a tree can in theory be embedded with arbitrarily low distortion. In this work, we analyze the geometric properties of the diagonal HWN, a standard choice of distribution in probabilistic modeling. The analysis shows that the distribution is ill-suited to representing data points at the same hierarchy level, which are separated by their angular distance at the same norm in the Poincaré disk model. We then empirically verify these limitations of the HWN and show how RoWN, the proposed distribution, can alleviate them on various hierarchical datasets, including a noisy synthetic binary tree, WordNet, and Atari 2600 Breakout.
What can we learn from signals and systems in a transformer? Insights for probabilistic modeling and inference architecture
Chang, Heng-Sheng, Mehta, Prashant G.
In the 1940s, Wiener introduced a linear predictor, where the future prediction is computed by linearly combining the past data. A transformer generalizes this idea: it is a nonlinear predictor where the next-token prediction is computed by nonlinearly combining the past tokens. In this essay, we present a probabilistic model that interprets transformer signals as surrogates of conditional measures, and layer operations as fixed-point updates. An explicit form of the fixed-point update is described for the special case when the probabilistic model is a hidden Markov model (HMM). In part, this paper is an attempt to bridge classical nonlinear filtering theory with modern inference architectures.
- North America > United States > Illinois > Champaign County > Urbana (0.14)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
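For the HMM special case mentioned in the abstract, the classical nonlinear-filter update that the essay connects to layer operations can be sketched in a few lines. The transition and emission matrices below are illustrative, not taken from the paper; each step maps the current conditional measure over hidden states to the next one (predict with the transition model, correct with the observation likelihood, renormalize).

```python
import numpy as np

# Illustrative 2-state HMM: A[i, j] = P(next state j | state i),
# B[i, y] = P(observation y | state i).
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])
B = np.array([[0.7, 0.3],
              [0.1, 0.9]])

def filter_update(pi, y):
    """One fixed-point-style step of the nonlinear filter."""
    predicted = A.T @ pi          # prior over the next hidden state
    unnorm = B[:, y] * predicted  # reweight by the likelihood of observing y
    return unnorm / unnorm.sum()  # conditional measure given all past observations

# Run the filter over an observation sequence, starting from a uniform prior.
pi = np.array([0.5, 0.5])
for y in [0, 0, 1, 1, 1]:
    pi = filter_update(pi, y)
```

After the run of observations favoring state 1's emissions, the filtered measure concentrates on state 1, mirroring how a predictor's internal state accumulates evidence from past tokens.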
Expressive power of tensor-network factorizations for probabilistic modeling
Ivan Glasser, Ryan Sweke, Nicola Pancotti, Jens Eisert, Ignacio Cirac
Many problems in diverse areas of computer science and physics involve constructing efficient representations of high-dimensional functions. Neural networks are a particular example of such representations that have enjoyed great empirical success, and much effort has been dedicated to understanding their expressive power - i.e. the set of functions that they can efficiently represent. Analogously, tensor networks are a class of powerful representations of high-dimensional arrays (tensors), for which a variety of algorithms and methods have been developed.
- North America > Canada > Quebec > Montreal (0.04)
- North America > Canada > British Columbia > Vancouver (0.04)
- Europe > France (0.04)
- (7 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.96)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Probabilistic Modeling of Spiking Neural Networks with Contract-Based Verification
Yao, Zhen, De Maria, Elisabetta, De Simone, Robert
Spiking Neural Networks (SNN) are models for "realistic" neuronal computation, which makes them somewhat different in scope from the "ordinary" deep-learning models widely used in AI platforms nowadays. SNNs focus on the timed latency (and possibly probability) of neuronal reactive activation/response, more than on the numerical computation of filters. So, an SNN model must provide modeling constructs for elementary neural bundles and then for synaptic connections to assemble them into compound data-flow network patterns. These elements are to be parametric patterns, with latency and probability values instantiated on particular instances (while supposedly constant "at runtime"). Designers could also use different values to represent "tired" neurons, or ones impaired by external drugs, for instance. One important challenge in such modeling is to study how compound models could meet global reaction requirements (under stochastic timing), provided similar provisions on individual neural bundles. A temporal logic language to express such assume/guarantee contracts is thus needed. This may lead to formal verification on medium-sized models and testing observations on large ones. In the current article, we make preliminary progress at providing a simple model framework to express both elementary SNN neural bundles and their connecting constructs, which translates readily into both a model-checker and a simulator (both already existing and robust) to conduct experiments.
- Europe > France > Provence-Alpes-Côte d'Azur (0.04)
- North America > United States (0.04)
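One way to render the abstract's "parametric pattern" idea in code is a toy bundle parameterized by latency and firing probability, chained synaptically; this is purely an illustration of the modeling constructs described, not the paper's framework, and all names and values here are hypothetical.

```python
import random

class Bundle:
    """Toy neural bundle: fires after a fixed latency with a given probability.

    Latency and probability are the instantiated parameters of the pattern;
    lowering `prob` can model a "tired" or drug-impaired bundle.
    """
    def __init__(self, latency, prob):
        self.latency = latency
        self.prob = prob

    def react(self, t, rng):
        """Return the spike time for an input at time t, or None if no spike."""
        return t + self.latency if rng.random() < self.prob else None

def chain_reaction(bundles, t, rng):
    """Propagate a spike through synaptically chained bundles."""
    for b in bundles:
        t = b.react(t, rng)
        if t is None:
            return None
    return t

rng = random.Random(0)
chain = [Bundle(latency=2, prob=0.95), Bundle(latency=3, prob=0.9)]
# When every bundle fires, end-to-end latency is the sum of bundle latencies.
result = chain_reaction(chain, t=0, rng=rng)
```

A global assume/guarantee contract for this chain would then be a statement like "with probability at least 0.855, a reaction occurs within 5 time units", which is the kind of property a model-checker could verify over the compound pattern.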
Addressing Class Imbalance with Probabilistic Graphical Models and Variational Inference
Lou, Yujia, Liu, Jie, Sheng, Yuan, Wang, Jiawei, Zhang, Yiwei, Ren, Yaokun
This study proposes a method for imbalanced data classification based on deep probabilistic graphical models (DPGMs) to solve the problem that traditional methods have insufficient learning ability for minority-class samples. To address the classification bias caused by class imbalance, we introduce variational inference to optimize the probabilistic modeling, which enables the model to adaptively adjust the representation of minority classes, and combine it with a class-aware weight adjustment strategy to enhance the classifier's sensitivity to minority classes. In addition, we combine an adversarial learning mechanism to generate minority-class samples in the latent space so that the model can better characterize category boundaries in the high-dimensional feature space. The method is evaluated on the Kaggle "Credit Card Fraud Detection" dataset and compared with a variety of advanced imbalanced classification methods (such as GAN-based sampling, BRF, XGBoost-Cost Sensitive, SAAD, HAN). The results show that the proposed method achieves the best performance on the AUC, Precision, Recall, and F1-score indicators, effectively improving the recognition rate of minority classes and reducing the false alarm rate. This method can be widely applied to imbalanced classification tasks such as financial fraud detection, medical diagnosis, and anomaly detection, providing a new solution for related research.
- Banking & Finance (1.00)
- Law Enforcement & Public Safety > Fraud (0.90)
- Information Technology > Security & Privacy (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
- Information Technology > Data Science > Data Mining > Anomaly Detection (0.71)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)
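The class-aware weight adjustment mentioned in the abstract can be sketched with the simplest common instantiation: inverse-frequency class weights in a cross-entropy loss. The weighting scheme and numbers below are a generic illustration, not the paper's implementation.

```python
import numpy as np

# Illustrative imbalanced label set: class 1 (e.g. fraud) is rare.
labels = np.array([0] * 95 + [1] * 5)

# Class-aware weights: inverse frequency, so the minority class is upweighted.
counts = np.bincount(labels)
weights = len(labels) / (len(counts) * counts)  # majority ~0.53, minority 10.0

def weighted_cross_entropy(probs, labels, weights):
    """Cross-entropy where each sample is scaled by its class weight,
    boosting the loss (and gradient) contribution of minority samples."""
    p_true = probs[np.arange(len(labels)), labels]
    return float(np.mean(weights[labels] * -np.log(p_true)))

# A classifier that confidently predicts the majority class everywhere
# pays a much larger penalty under the class-aware loss.
probs = np.tile([0.99, 0.01], (len(labels), 1))
loss_weighted = weighted_cross_entropy(probs, labels, weights)
loss_plain = weighted_cross_entropy(probs, labels, np.ones(len(counts)))
```

The same reweighting idea carries over to a variational objective: the per-sample ELBO terms for minority-class examples can be scaled up so the latent representation does not collapse onto the majority class.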
Reviews: Classification-by-Components: Probabilistic Modeling of Reasoning over a Set of Components
Originality - this research is similar to prototype-based learning of neural networks, but it is the first to propose learning and detecting generic components that characterize objects using three different types of reasoning (positive, negative and indefinite). Clarity - the paper is hard to read and follow. There are large chunks of text with no figures or equations to illustrate the concepts. In the supplementary material they provide a lot more information which was left out of the main paper. It does feel like the paper is not self-sufficient, as many important steps are only brushed over, such as the training procedure and how to generate the interpretations.
Reviews: Classification-by-Components: Probabilistic Modeling of Reasoning over a Set of Components
The paper proposes an interesting probabilistic reasoning process that considers the presence or absence of various components (that are indicative of several properties of an instance) and combines them together as (potentially interpretable) evidence for its final classification. The idea seems to us novel and interesting. Multiple experiments are provided to support the approach. The paper is also well-written and clear.
Reviews: Expressive power of tensor-network factorizations for probabilistic modeling
The authors compare the ranks of tensor representations of HMMs, and outputs of quantum circuits with two-qubit unitary gates yielding Matrix Product States (MPS) and so-called Locally Purified States (LPS) when ancillary unmeasured bits are present. A general comment: Born machines automatically enforce positivity, but is it clear that (3) and (4) are less than 1? The A's come from some unitary circuits in the SM? If yes, the main problem formulation seems not self-contained in Sect. 2. Some results are more surprising, namely the very large (at least of the order of the number of qubits) difference in rank when one works in the real field versus the complex field.